
Collaborating Authors

Stephen King


What Was Grammarly Thinking?

The Atlantic - Technology

A short-lived AI tool promised to help users write like the greats--and a bunch of other random people, including me. To me, the best first sentence of any piece of journalism is the one in Joan Didion's 1987 book, which begins like this: "Havana vanities come to dust in Miami." I love that sentence and that propulsive first chapter so much that I once sat down to try to figure out how she did it. I looked at the sentences one at a time to assess what purpose each one was serving, and I counted how many of them Didion had needed to accomplish each thing she wanted to accomplish. Then I thought about how she figured out what order to put them in for maximum page-turning impact.




The best new science fiction books of August 2025

New Scientist

In The End of the World As We Know It, other writers are telling stories set in the post-apocalyptic world of Stephen King's The Stand. One of my most anticipated books of the year is out this month: a collection of short stories set in the post-apocalyptic devastation of Stephen King's The Stand. I love a good end-times story, and King did it so well in this doorstopper of a book, first published in 1978. How will the writers he has invited to develop his "world" fare? Suitably depressed by these visions of the future, I'm then planning to pick myself up with New Scientist columnist Annalee Newitz's cosier take, Automatic Noodle, which comes complete with jolly robots and cooking. From thrillers (Artificial Wisdom) to more literary takes (Helm), Star Wars to the latest from the prolific Adrian Tchaikovsky, let's get reading!


LLM Unlearning Should Be Form-Independent

Ye, Xiaotian, Zhang, Mengqi, Wu, Shu

arXiv.org Artificial Intelligence

Large Language Model (LLM) unlearning aims to erase or suppress undesirable knowledge within the model, offering promise for controlling harmful or private information to prevent misuse. However, recent studies highlight its limited efficacy in real-world scenarios, hindering practical adoption. In this study, we identify a pervasive issue underlying many downstream failures: the effectiveness of existing unlearning methods heavily depends on the form of training samples and frequently fails to generalize to alternate expressions of the same knowledge. We formally characterize this problem as Form-Dependent Bias and systematically investigate its specific manifestation patterns across various downstream tasks. To quantify its prevalence and support future research, we introduce ORT, a novel benchmark designed to evaluate the robustness of unlearning methods against variations in knowledge expression. Results reveal that Form-Dependent Bias is both widespread and severe among current techniques. We argue that LLM unlearning should be form-independent to address the endless forms of downstream tasks encountered in real-world security-critical scenarios. Towards this goal, we introduce Rank-one Concept Redirection (ROCR), a novel training-free method, as a promising solution path. ROCR performs unlearning by targeting the invariants in downstream tasks, specifically the activated dangerous concepts. It is capable of modifying model parameters within seconds to redirect the model's perception of a specific unlearning target concept to another harmless concept. Extensive experiments demonstrate that ROCR significantly improves unlearning effectiveness compared to traditional methods while generating highly natural outputs.
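The core mechanism the abstract describes, redirecting one concept's activation to another via a rank-one parameter update, can be sketched in a few lines. This is a minimal illustration of the general rank-one-edit idea under our own simplifying assumptions (a single linear layer, plain Python vectors), not the authors' implementation:

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rank_one_redirect(W, k, v):
    """Rank-one edit so the layer maps the target concept's key k to the
    harmless concept's value v, while directions orthogonal to k are
    untouched: W' = W + (v - W k) k^T for unit-norm k, hence W' k = v."""
    n = sum(ki * ki for ki in k) ** 0.5
    k = [ki / n for ki in k]          # normalize the key direction
    Wk = matvec(W, k)                 # current response to the concept
    return [[wij + (vi - wki) * kj for wij, kj in zip(row, k)]
            for row, vi, wki in zip(W, v, Wk)]

# Toy demo: after the edit, the "dangerous" key k maps to the harmless v.
W = [[1.0, 0.0], [0.0, 1.0]]
k = [1.0, 0.0]                        # activation of the unlearning target
v = [2.0, 3.0]                        # activation of the harmless concept
W2 = rank_one_redirect(W, k, v)
print(matvec(W2, k))                  # -> [2.0, 3.0]
```

Because the update is rank one, it can be applied to a full-size model in seconds, which matches the training-free, near-instant editing the abstract claims.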


RULE: Reinforcement UnLEarning Achieves Forget-Retain Pareto Optimality

Zhang, Chenlong, Jin, Zhuoran, Yuan, Hongbang, Wei, Jiaheng, Zhou, Tong, Liu, Kang, Zhao, Jun, Chen, Yubo

arXiv.org Artificial Intelligence

The widespread deployment of Large Language Models (LLMs) trained on massive, uncurated corpora has raised growing concerns about the inclusion of sensitive, copyrighted, or illegal content. This has led to increasing interest in LLM unlearning: the task of selectively removing specific information from a model without retraining from scratch or degrading overall utility. However, existing methods often rely on large-scale forget and retain datasets, and suffer from unnatural responses, poor generalization, or catastrophic utility loss. In this work, we propose Reinforcement UnLearning (RULE), an efficient framework that formulates unlearning as a refusal boundary optimization problem. RULE is trained with a small portion of the forget set and synthesized boundary queries, using a verifiable reward function that encourages safe refusal on forget-related queries while preserving helpful responses on permissible inputs. We provide both theoretical and empirical evidence demonstrating the effectiveness of RULE in achieving targeted unlearning without compromising model utility. Experimental results show that, with only 12% of the forget set and 8% synthesized boundary data, RULE outperforms existing baselines by up to 17.5% in forget quality and 16.3% in response naturalness while maintaining general utility, achieving forget-retain Pareto optimality. Remarkably, we further observe that RULE improves the naturalness of model outputs, enhances training efficiency, and exhibits strong generalization, extending refusal behavior to semantically related but unseen queries.
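A "verifiable reward" of the kind the abstract describes can be checked mechanically rather than judged by another model. The sketch below is our own illustrative stand-in (the marker list and scoring values are assumptions, not the paper's actual reward):

```python
# Surface markers used to verify refusal; purely illustrative.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm not able", "sorry")

def is_refusal(response: str) -> bool:
    """Cheap, verifiable check for a refusal-style response."""
    r = response.lower()
    return any(m in r for m in REFUSAL_MARKERS)

def rule_reward(query_is_forget: bool, response: str) -> float:
    """Reward refusal exactly on forget-related queries and helpful
    (non-refusal) answers on permissible inputs; penalize the rest."""
    refused = is_refusal(response)
    if query_is_forget:
        return 1.0 if refused else -1.0
    return 1.0 if not refused else -1.0
```

Optimizing such a reward over a mix of forget queries and synthesized boundary queries is what lets the method trace out a refusal boundary instead of memorizing individual refusals.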


Just What You Desire: Constrained Timeline Summarization with Self-Reflection for Enhanced Relevance

Qorib, Muhammad Reza, Hu, Qisheng, Ng, Hwee Tou

arXiv.org Artificial Intelligence

Given news articles about an entity, such as a public figure or organization, timeline summarization (TLS) involves generating a timeline that summarizes the key events about the entity. However, the TLS task is underspecified, since what is of interest to each reader may vary, and hence there is no single ideal or optimal timeline. In this paper, we introduce a novel task, called Constrained Timeline Summarization (CTLS), in which every event in the generated timeline must satisfy a given constraint. An example of a constrained timeline concerns the legal battles of Tiger Woods, where only events related to his legal problems are selected to appear in the timeline. We collected a new human-verified dataset of constrained timelines involving 47 entities and 5 constraints per entity. We propose an approach that employs a large language model (LLM) to summarize news articles according to a specified constraint and clusters the summaries to identify key events to include in a constrained timeline. In addition, we propose a novel self-reflection method during summary generation and demonstrate that it leads to improved performance.
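The summarize-under-constraint-then-cluster pipeline can be sketched as follows. All names here are illustrative assumptions (the `llm` callable stands in for any chat-completion API, and clustering is reduced to grouping by publication date); the paper's self-reflection step, which would re-check each summary against the constraint in a second LLM pass, is omitted for brevity:

```python
from collections import defaultdict

def constrained_summarize(article, constraint, llm):
    """Ask the LLM for a one-sentence event summary, or None if the
    article is irrelevant to the constraint."""
    prompt = (f"Constraint: {constraint}\n"
              f"Article: {article['text']}\n"
              "If relevant, summarize the key event in one sentence; "
              "otherwise reply IRRELEVANT.")
    reply = llm(prompt).strip()
    return None if reply == "IRRELEVANT" else reply

def build_timeline(articles, constraint, llm):
    """Cluster constraint-relevant summaries by date and keep one
    representative event per date as the timeline entry."""
    by_date = defaultdict(list)
    for art in articles:
        summary = constrained_summarize(art, constraint, llm)
        if summary:
            by_date[art["date"]].append(summary)
    return [(d, events[0]) for d, events in sorted(by_date.items())]
```

The constraint filter is what distinguishes CTLS from plain TLS: irrelevant articles drop out before clustering, so the timeline only ever contains events that satisfy the constraint.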


Says Who? Effective Zero-Shot Annotation of Focalization

Hicke, Rebecca M. M., Bizzoni, Yuri, Feldkamp, Pascale, Kristensen-McLachlan, Ross Deans

arXiv.org Artificial Intelligence

Focalization, the perspective through which narrative is presented, is encoded via a wide range of lexico-grammatical features and is subject to reader interpretation. Moreover, trained readers regularly disagree on interpretations, suggesting that this problem may be computationally intractable. In this paper, we provide experiments to test how well contemporary Large Language Models (LLMs) perform when annotating literary texts for focalization mode. Despite the challenging nature of the task, LLMs show comparable performance to trained human annotators in our experiments. We provide a case study working with the novels of Stephen King to demonstrate the usefulness of this approach for computational literary studies, illustrating how focalization can be studied at scale.


RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models

Jin, Zhuoran, Cao, Pengfei, Wang, Chenhao, He, Zhitao, Yuan, Hongbang, Li, Jiachun, Chen, Yubo, Liu, Kang, Zhao, Jun

arXiv.org Artificial Intelligence

Large language models (LLMs) inevitably memorize sensitive, copyrighted, and harmful knowledge from the training corpus; therefore, it is crucial to erase this knowledge from the models. Machine unlearning is a promising solution for efficiently removing specific knowledge by modifying models post hoc. In this paper, we propose a Real-World Knowledge Unlearning benchmark (RWKU) for LLM unlearning. RWKU is designed based on the following three key factors: (1) For the task setting, we consider a more practical and challenging unlearning setting, where neither the forget corpus nor the retain corpus is accessible. (2) For the knowledge source, we choose 200 real-world famous people as the unlearning targets and show that such popular knowledge is widely present in various LLMs. (3) For the evaluation framework, we design the forget set and the retain set to evaluate the model's capabilities across various real-world applications. Regarding the forget set, we provide four membership inference attack (MIA) methods and nine kinds of adversarial attack probes to rigorously test unlearning efficacy. Regarding the retain set, we assess locality and utility in terms of neighbor perturbation, general ability, reasoning ability, truthfulness, factuality, and fluency. We conduct extensive experiments across two unlearning scenarios, two models, and six baseline methods, and obtain some meaningful findings. We release our benchmark and code publicly at http://rwku-bench.github.io for future work.
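One common MIA signal of the kind such benchmarks use is loss-based: compare the average negative log-likelihood the model assigns to text about the unlearning target before and after unlearning. The sketch below uses toy per-token probabilities in place of real model outputs, so it illustrates the scoring rule only, not the benchmark's actual attack suite:

```python
import math

def avg_nll(token_probs):
    """Average negative log-likelihood of a text's tokens. Lower means
    the model still 'knows' the text; successful unlearning should
    raise this score on forget-set material."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

# Toy probabilities a model might assign to tokens of a fact about the
# unlearning target, before and after unlearning (illustrative values).
before = [0.9, 0.8, 0.85, 0.95]
after = [0.2, 0.1, 0.3, 0.25]
assert avg_nll(after) > avg_nll(before)  # the fact got harder to predict
```

Checking the same score on retain-set text, where it should *not* rise, is what separates a targeted unlearning method from one that simply degrades the model.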


Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models

Shi, Yucheng, Tan, Qiaoyu, Wu, Xuansheng, Zhong, Shaochen, Zhou, Kaixiong, Liu, Ninghao

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have shown proficiency in question-answering tasks but often struggle to integrate real-time knowledge updates, leading to potentially outdated or inaccurate responses. This problem becomes even more challenging when dealing with multi-hop questions, since they require LLMs to update and integrate multiple knowledge pieces relevant to the questions. To tackle the problem, we propose the Retrieval-Augmented model Editing (RAE) framework tailored for multi-hop question answering. RAE first retrieves edited facts and then refines the language model through in-context learning. Specifically, our retrieval approach, based on mutual information maximization, leverages the reasoning abilities of LLMs to identify chain facts that naïve similarity-based searches might miss. Additionally, our framework incorporates a pruning strategy to eliminate redundant information from the retrieved facts, which enhances the editing accuracy and mitigates the hallucination problem. Our framework is supported by theoretical justification for its fact retrieval efficacy. Finally, comprehensive evaluation across various LLMs validates RAE's ability to provide accurate answers with updated knowledge.
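The chain-fact retrieval plus pruning loop can be sketched greedily. This is our own simplified reading, not the paper's algorithm: the `score` callable stands in for the mutual-information objective (which the paper approximates with LLM probabilities), and the pruning rule here just drops facts whose marginal score is not positive:

```python
def greedy_chain_retrieval(question, facts, score, hops=2):
    """Greedily build a chain of edited facts for a multi-hop question.
    `score(question, chain, fact)` is a stand-in for the retrieval
    objective: it should return higher values for facts that usefully
    extend the current chain toward answering the question."""
    chain = []
    pool = list(facts)
    for _ in range(hops):
        if not pool:
            break
        best = max(pool, key=lambda f: score(question, chain, f))
        chain.append(best)
        pool.remove(best)
    # Pruning pass: drop facts that contribute nothing given the facts
    # kept so far, reducing redundancy-driven hallucination.
    pruned = []
    for f in chain:
        if not pruned or score(question, pruned, f) > 0:
            pruned.append(f)
    return pruned
```

The key property is that the second hop is scored against the question *and* the first retrieved fact, which is how chain facts invisible to naïve one-shot similarity search get picked up.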